In this section of the lab, you will be asked to apply what you have learned to create a RNN model that can generate new sequences of text based on what it has learned from a large set of existing text. In this case we will be using the full text of Lewis Carroll's Alice in Wonderland. Your task for the assignment is to:
Our previous model based on Obama's essay was prone to overfitting since there was not that much data to learn from. Thus, the generated text was either unintelligeable (not enough learning) or exactly replicated the training data (over-fitting). In this case, we are working with a much bigger data set, which should provide enough data to avoid over-fitting, but will also take more time to train. To improve your model, you can experiment with tuning the following hyper-parameters:
If you get an error such as alloc error
or out of memory error
during training it means that your computer does not have enough RAM memory to store the model parameters or the batch of training data needed during a training step. If you run into this issue, try reducing the complexity of your model (both number and depth of layers) or the mini-batch size.
The last three code blocks will use your trained model to generate a sequence of text based on a predefined seed. Do not change any of the code, but run it before submitting your assignment. Your work will be evaluated based on the quality of the generated text. A good result should be legible with decent grammar and spelling (this indicates a high level of learning), but the exact text should not be found anywhere in the actual text (this indicates over-fitting).
Let's start by importing the libraries we will be using, and importing the full text from Alice in Wonderland:
In [ ]:
import numpy as np
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import Dropout
from keras.layers import LSTM
from keras.callbacks import ModelCheckpoint
from keras.utils import np_utils
from time import gmtime, strftime
import os
import re
import pickle
import random
import sys
In [ ]:
filename = "data/wonderland.txt"
raw_text = open(filename).read()
raw_text = re.sub('[^\nA-Za-z0-9 ,.:;?!-]+', '', raw_text)
raw_text = raw_text.lower()
n_chars = len(raw_text)
print "length of text:", n_chars
print "text preview:", raw_text[:500]
In [ ]:
# write your code here
Do not change this code, but run it before submitting your assignment to generate the results
In [ ]:
def sample(preds, temperature=1.0):
preds = np.asarray(preds).astype('float64')
preds = np.log(preds) / temperature
exp_preds = np.exp(preds)
preds = exp_preds / np.sum(exp_preds)
probas = np.random.multinomial(1, preds, 1)
return np.argmax(probas)
In [ ]:
def generate(sentence, sample_length=50, diversity=0.35):
generated = sentence
sys.stdout.write(generated)
for i in range(sample_length):
x = np.zeros((1, X.shape[1], X.shape[2]))
for t, char in enumerate(sentence):
x[0, t, char_to_int[char]] = 1.
preds = model.predict(x, verbose=0)[0]
next_index = sample(preds, diversity)
next_char = int_to_char[next_index]
generated += next_char
sentence = sentence[1:] + next_char
sys.stdout.write(next_char)
sys.stdout.flush()
print
In [ ]:
prediction_length = 500
seed = "this time alice waited patiently until it chose to speak again. in a minute or two the caterpillar t"
generate(seed, prediction_length, .50)